Process Mining

Process mining is a family of techniques relating the fields of data science and process management to support the analysis of operational processes based on event logs. The goal of process mining is to turn event data into insights and actions. Process mining is an integral part of data science, fueled by the availability of event data and the desire to improve processes. Process mining techniques use event data to show what people, machines, and organizations are really doing. Process mining provides novel insights that can be used to identify the execution paths taken by operational processes and to address their performance and compliance problems. Process mining starts from event data: its input is an event log, which views a process from a particular angle. Each event in the log should contain (1) a unique identifier for a particular process instance (called the case id), (2) an activity (a description of the event that is occurring), and (3) a timestamp. There may be additional event attributes referring to resources, costs, etc., but these are optional. With some effort, such data can be extracted from any information system supporting operational processes. Process mining uses these event data to answer a variety of process-related questions. There are three main classes of process mining techniques: ''process discovery'', ''conformance checking'', and ''process enhancement''. In the past, terms such as ''Workflow Mining'' and ''Automated Business Process Discovery'' (ABPD) were used.
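The three mandatory event attributes can be illustrated with a small sketch. The log below is invented for illustration, and the field names `case_id`, `activity`, `timestamp`, and `resource` are assumptions rather than a standard schema. The sketch checks that every event carries the required attributes and groups events into one time-ordered trace per case.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: each event carries the three mandatory attributes
# (case id, activity, timestamp); "resource" is an optional extra attribute.
EVENT_LOG = [
    {"case_id": "order-1", "activity": "register", "timestamp": "2024-01-01T09:00:00", "resource": "Ann"},
    {"case_id": "order-2", "activity": "register", "timestamp": "2024-01-01T09:05:00"},
    {"case_id": "order-1", "activity": "ship", "timestamp": "2024-01-01T11:30:00"},
    {"case_id": "order-1", "activity": "check", "timestamp": "2024-01-01T10:00:00"},
    {"case_id": "order-2", "activity": "reject", "timestamp": "2024-01-01T09:45:00"},
]

REQUIRED = ("case_id", "activity", "timestamp")


def build_traces(events):
    """Group events by case id and order each case by timestamp."""
    cases = defaultdict(list)
    for event in events:
        missing = [key for key in REQUIRED if key not in event]
        if missing:
            raise ValueError(f"event lacks required attributes: {missing}")
        cases[event["case_id"]].append(event)
    traces = {}
    for case_id, case_events in cases.items():
        # Events may arrive out of order, so sort within each case.
        case_events.sort(key=lambda e: datetime.fromisoformat(e["timestamp"]))
        traces[case_id] = [e["activity"] for e in case_events]
    return traces


traces = build_traces(EVENT_LOG)
```

Each resulting trace is the sequence of activities observed for one process instance, which is the form most discovery and conformance techniques start from.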


Overview

Process mining techniques are often used when no formal description of the process can be obtained by other approaches, or when the quality of existing documentation is questionable. For example, applying process mining to the audit trails of a workflow management system, the transaction logs of an enterprise resource planning system, or the electronic patient records in a hospital can result in models describing the processes of organizations. Event log analysis can also be used to compare event logs with ''prior'' model(s) to understand whether the observations conform to a prescriptive or descriptive model. This requires that the event log data be linked to a case ID, activities, and timestamps.

Contemporary management trends such as BAM (Business Activity Monitoring), BOM (Business Operations Management), and BPI (business process intelligence) illustrate the interest in supporting diagnosis functionality in the context of Business Process Management technology (e.g., Workflow Management Systems and other ''process-aware'' information systems).

Process mining is different from mainstream machine learning, data mining, and artificial intelligence techniques. For example, process discovery techniques in the field of process mining try to discover end-to-end process models that are able to describe sequential, choice, concurrent, and loop behavior. Conformance checking techniques are closer to optimization than to traditional learning approaches. However, process mining can be used to generate machine learning, data mining, and artificial intelligence problems. After discovering a process model and aligning the event log, it is possible to create basic supervised and unsupervised learning problems, for example, to predict the remaining processing time of a running case or to identify the root causes of compliance problems.

The IEEE Task Force on Process Mining was established in October 2009 as part of the IEEE Computational Intelligence Society. This vendor-neutral organization aims to promote the research, development, education, and understanding of process mining; make end-users, developers, consultants, and researchers aware of the state of the art in process mining; promote the use of process mining techniques and tools and stimulate new applications; play a role in standardization efforts for logging event data (e.g., XES); organize tutorials, special sessions, workshops, competitions, and panels; and develop material (papers, books, online courses, movies, etc.) to inform and guide people new to the field. The IEEE Task Force on Process Mining established the International Process Mining Conference (ICPM) series, led the development of the IEEE XES standard for storing and exchanging event data, and wrote the Process Mining Manifesto, which was translated into 16 languages.
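The remaining-processing-time problem mentioned in this overview can be framed as supervised learning roughly as follows. This is a minimal sketch under stated assumptions: the completed cases are invented, the feature choice (activities seen so far plus elapsed seconds) is one simple option among many, and no particular learning algorithm is implied.

```python
from datetime import datetime

# Hypothetical completed cases: (activity, ISO timestamp) pairs in execution order.
COMPLETED_CASES = {
    "c1": [("register", "2024-01-01T09:00:00"),
           ("check", "2024-01-01T10:00:00"),
           ("ship", "2024-01-01T12:00:00")],
    "c2": [("register", "2024-01-02T09:00:00"),
           ("reject", "2024-01-02T09:30:00")],
}


def remaining_time_examples(cases):
    """Turn each proper prefix of a completed case into a (features, target) pair.

    Features: the activities seen so far and the elapsed time in seconds.
    Target: seconds from the last event of the prefix to the end of the case.
    """
    examples = []
    for events in cases.values():
        times = [datetime.fromisoformat(ts) for _, ts in events]
        start, end = times[0], times[-1]
        for i in range(1, len(events)):  # proper prefixes only
            prefix = tuple(activity for activity, _ in events[:i])
            elapsed = (times[i - 1] - start).total_seconds()
            remaining = (end - times[i - 1]).total_seconds()
            examples.append(((prefix, elapsed), remaining))
    return examples


examples = remaining_time_examples(COMPLETED_CASES)
```

Any regression method could then be trained on these pairs and applied to the prefix of a running case.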


History and place in data science

The term "process mining" was coined in a research proposal written by the Dutch computer scientist Wil van der Aalst ("Godfather of Process mining"). Thus began a new field of research that emerged under the umbrella of techniques related to data science and process science at Eindhoven University of Technology in 1999. In the early days, process mining techniques were often conflated with the techniques used for workflow management. In 2000, the first practically applicable algorithm for process discovery, the "Alpha miner", was developed. The following year, 2001, a similar algorithm based on heuristics, called the "Heuristic miner", was introduced in research papers. Later, more powerful algorithms such as the inductive miner were developed for process discovery. As the field evolved, conformance checking became an integral part of it. The year 2004 marked the development of "token-based replay" for conformance checking. Apart from the mainstream techniques of process discovery and conformance checking, process mining branched out into multiple areas, leading to the development of "performance analysis", "decision mining", and "organizational mining" in 2005 and 2006. In 2007, the first commercial process mining company, "Futura Pi", was established. The IEEE Task Force on Process Mining, a governing body that began to oversee the norms and standards related to process mining, was formed in 2009. Further conformance checking techniques were developed, which led to the publication of alignment-based conformance checking in 2010. In 2011, the first process mining book was published. In 2014, Coursera offered a MOOC on process mining. By 2018, roughly 30 commercial process mining tools were available. The year 2019 marked the first process mining conference. Today, over 35 vendors offer tools and techniques for process discovery and conformance checking.

Process mining should be viewed as a bridge between data science and process science. It focuses on transforming an event log into a meaningful representation of the process, which can lead to the formulation of several data science and machine learning problems.


Categories

There are three categories of process mining techniques.
* ''Process discovery'': The first step in process mining. The main goal of process discovery is to transform the event log into a process model. An event log can come from any data storage system that records the activities in an organisation along with the timestamps for those activities. Such an event log is required to contain a case id (a unique identifier to recognise the case to which the activity belongs), an activity description (a textual description of the activity executed), and the timestamp of the activity execution. The result of process discovery is generally a process model that is representative of the event log. Such a process model can be discovered, for example, using techniques such as the alpha algorithm (a didactically driven approach), the heuristic miner, or the inductive miner (Aalst, W. van der, Weijters, A., & Maruster, L. (2004). Workflow Mining: Discovering Process Models from Event Logs. IEEE Transactions on Knowledge and Data Engineering, 16 (9), 1128–1142). Many established techniques exist for automatically constructing process models (for example, Petri nets, BPMN diagrams, activity diagrams, state diagrams, and EPCs) based on an event log. Recently, process mining research has started targeting other perspectives (e.g., data, resources, time). One example is the technique described in (Aalst, Reijers, & Song, 2005), which can be used to construct a social network. Nowadays, techniques such as "streaming process mining" are being developed to work with continuous online data that has to be processed on the spot.
* ''Conformance checking'': Helps in comparing an event log with an existing process model to analyse the discrepancies between them. Such a process model can be constructed manually or with the help of a discovery algorithm. For example, a process model may indicate that purchase orders of more than 1 million euros require two checks. Another example is the checking of the so-called "four-eyes" principle. Conformance checking may be used to detect deviations (compliance checking), to evaluate discovery algorithms, or to enrich an existing process model. Various techniques, such as "token-based replay" and "streaming conformance checking", are used depending on the needs of the system. An example of enrichment is the extension of a process model with performance data, i.e., some ''a priori'' process model is used to project the potential bottlenecks. Another example is the ''decision miner'' described in (Rozinat & Aalst, 2006b; see also Rozinat, A., & Aalst, W. van der (2006a). Conformance Testing: Measuring the Fit and Appropriateness of Event Logs and Process Models. In C. Bussler et al. (Ed.), BPM 2005 Workshops (Workshop on Business Process Intelligence) (Vol. 3812, pp. 163–176). Springer-Verlag, Berlin), which takes an ''a priori'' process model and analyses every choice in the process model. The event log is consulted for each option to see which information is typically available the moment the choice is made. Then classical data mining techniques are used to see which data elements influence the choice. As a result, a decision tree is generated for each choice in the process.
* ''Performance analysis'': Used when there is an ''a priori'' model. The model is extended with additional performance information such as processing times, cycle times, waiting times, and costs, so that the goal is ''not'' to check conformance, but rather to improve the performance of the existing model with respect to certain process performance measures. An example is the extension of a process model with performance data, i.e., some prior process model dynamically annotated with performance data. It is also possible to extend process models with additional information such as decision rules and organisational information (e.g., roles).
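The three categories can be sketched on a toy log. The code below is an illustration under loud assumptions: the log and the reference model are invented, discovery is reduced to counting a directly-follows graph (a far simpler representation than the models produced by the alpha, heuristic, or inductive miners), and the conformance check only flags forbidden directly-follows steps, which is much cruder than token-based replay or alignments.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log: case id -> ordered (activity, ISO timestamp) pairs.
LOG = {
    "c1": [("register", "2024-01-01T09:00:00"),
           ("check", "2024-01-01T10:00:00"),
           ("ship", "2024-01-01T12:00:00")],
    "c2": [("register", "2024-01-02T09:00:00"),
           ("check", "2024-01-02T11:00:00"),
           ("ship", "2024-01-02T12:00:00")],
    "c3": [("register", "2024-01-03T09:00:00"),
           ("ship", "2024-01-03T09:30:00")],
}


def discover_dfg(log):
    """Discovery: count how often activity a is directly followed by b."""
    dfg = Counter()
    for events in log.values():
        activities = [activity for activity, _ in events]
        dfg.update(zip(activities, activities[1:]))
    return dfg


def deviations(trace, allowed_pairs):
    """Conformance: list the directly-follows steps of a trace that the model forbids."""
    return [pair for pair in zip(trace, trace[1:]) if pair not in allowed_pairs]


def mean_waiting_times(log):
    """Performance: mean seconds elapsed on each directly-follows edge."""
    totals, counts = Counter(), Counter()
    for events in log.values():
        for (a, t1), (b, t2) in zip(events, events[1:]):
            delta = datetime.fromisoformat(t2) - datetime.fromisoformat(t1)
            totals[(a, b)] += delta.total_seconds()
            counts[(a, b)] += 1
    return {edge: totals[edge] / counts[edge] for edge in totals}


dfg = discover_dfg(LOG)
# A hand-made reference model that requires a check before shipping.
MODEL = {("register", "check"), ("check", "ship")}
skipped_check = deviations(["register", "ship"], MODEL)
waits = mean_waiting_times(LOG)
```

Here case `c3` ships without a check, so its trace deviates from the hand-made model, and the waiting times per edge are the kind of annotation that performance analysis adds to a model.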


See also

* Business Process Management
* Process Discovery
* Conformance Checking
* Workflow Management
* Machine Learning
* Data Science
* Sequence mining
* Data mining
* Intention mining
* Data visualization
* Process analysis


Further reading

* Aalst, W. van der (2016). Process Mining: Data Science in Action. Springer Verlag, Berlin.
* Reinkemeyer, L. (2020). Process Mining in Action: Principles, Use Cases and Outlook. Springer Verlag, Berlin.
* Carmona, J., van Dongen, B.F., Solti, A., Weidlich, M. (2018). Conformance Checking: Relating Processes and Models. Springer Verlag, Berlin.
* Aalst, W. van der (2011). Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer Verlag, Berlin.
* Aalst, W. van der, Dongen, B. van, Herbst, J., Maruster, L., Schimm, G., & Weijters, A. (2003). Workflow Mining: A Survey of Issues and Approaches. Data and Knowledge Engineering, 47 (2), 237–267.
* Aalst, W. van der, Reijers, H., & Song, M. (2005). Discovering Social Networks from Event Logs. Computer Supported Cooperative Work, 14 (6), 549–593.
* Jans, M., van der Werf, J.M., Lybaert, N., Vanhoof, K. (2011). A business process mining application for internal transaction fraud mitigation. Expert Systems with Applications, 38 (10), 13351–13359.
* Dongen, B. van, Medeiros, A., Verbeek, H., Weijters, A., & Aalst, W. van der (2005). The ProM framework: A New Era in Process Mining Tool Support. In G. Ciardo & P. Darondeau (Eds.), Application and Theory of Petri Nets 2005 (Vol. 3536, pp. 444–454). Springer-Verlag, Berlin.
* Aalst, W. van der (2019). A Practitioner's Guide to Process Mining: Limitations of the Directly-Follows Graph. In International Conference on Enterprise Information Systems (Centeris 2019), volume 164 of Procedia Computer Science, pages 321–328. Elsevier.
* Grigori, D., Casati, F., Castellanos, M., Dayal, U., Sayal, M., & Shan, M. (2004). Business Process Intelligence. Computers in Industry, 53 (3), 321–343.
* Grigori, D., Casati, F., Dayal, U., & Shan, M. (2001). Improving Business Process Quality through Exception Understanding, Prediction, and Prevention. In P. Apers, P. Atzeni, S. Ceri, S. Paraboschi, K. Ramamohanarao, & R. Snodgrass (Eds.), Proceedings of 27th international conference on Very Large Data Bases (VLDB’01) (pp. 159–168). Morgan Kaufmann.
* IDS Scheer. (2002). ARIS Process Performance Manager (ARIS PPM): Measure, Analyze and Optimize Your Business Process Performance (whitepaper).
* Ingvaldsen, J.E., & Gulla, J.A. (2006). Model Based Business Process Mining. Journal of Information Systems Management, Vol. 23, No. 1, Special Issue on Business Intelligence. Auerbach Publications.
* Kirchmer, M., Laengle, S., & Masias, V. (2013). Transparency-Driven Business Process Management in Healthcare Settings [Leading Edge]. Technology and Society Magazine, IEEE, 32(4), 14–16.
* zur Muehlen, M. (2004). Workflow-based Process Controlling: Foundation, Design and Application of workflow-driven Process Information Systems. Logos, Berlin.
* zur Muehlen, M., & Rosemann, M. (2000). Workflow-based Process Monitoring and Controlling – Technical and Organizational Issues. In R. Sprague (Ed.), Proceedings of the 33rd Hawaii international conference on system science (HICSS-33) (pp. 1–10). IEEE Computer Society Press, Los Alamitos, California.
* Rozinat, A., & Aalst, W. van der (2006b). Decision Mining in ProM. In S. Dustdar, J. Fiadeiro, & A. Sheth (Eds.), International Conference on Business Process Management (BPM 2006) (Vol. 4102, pp. 420–425). Springer-Verlag, Berlin.
* Sayal, M., Casati, F., Dayal, U., & Shan, M. (2002). Business Process Cockpit. In Proceedings of 28th international conference on very large data bases (VLDB’02) (pp. 880–883). Morgan Kaufmann.
* Huser V, Starren JB, EHR Data Pre-processing Facilitating Process Mining: an Application to Chronic Kidney Disease. AMIA Annu Symp Proc 200
* Ross-Talbot S, The importance and potential of descriptions to our industry. Keynote at The 10th International Federated Conference on Distributed Computing Technique
* Garcia, Cleiton dos Santos; Meincheim, Alex; et al. (2019). Process mining techniques and applications – A systematic mapping study. Expert Systems with Applications. 133: 260–295. ISSN 0957-4174. doi:10.1016/j.eswa.2019.05.00
* van der Aalst, W.M.P. and Berti A. Discovering Object-Centric Petri Nets. Fundamenta Informaticae, 175(1-4):1-40, 2020.


External links


* International Process Mining Conference, the annual international process mining conference organized by the IEEE Task Force on Process Mining.
* Process mining research at Eindhoven University of Technology, the Netherlands.
* Process mining research at Ghent University, Belgium.
* Process mining research at University of Padua, Italy.